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ABSTRACT 



Context. Longtime monitoring of gravitational lens systems is often done using telescopes and recording equipment with modest 
resolution. Still, it would be interesting to get as much information as possible from the measured lightcurves. From high resolution 
images we know that the recorded quasar images are often blends and that the corresponding time series are not pure shifted replicas 
of the source variability. 

Aims. In this paper we will develop an algorithm to unscramble this kind of blended data. 

Methods. The proposed method is based on a simple idea. We use one of the photometric curves, which is supposedly a simple shifted 
replica of the source curve, to build different artificial combined curves. Then we compare these artificial curves with the blended 
curves. Proper solutions for a full set of time delays are then obtained by varying free input parameters and estimating statistical 
distances between the artificial and blended curves. 

Results. We performed a check of feasibility and applicability of the new algorithm. For numerically generated data sets the time delay 
systems were recovered for a wide range of setups. Application of the new algorithm to the classical double quasar QSO 0957+561 
A,B lightcurves shows a clear splitting of one of the images. This is an unexpected result and extremely interesting, especially in the 
context of the recent controversy about the exact time delay value for the system. 

Conclusions. The proposed method allows us to properly analyse the data from low resolution observations that have long time 
coverages. There are a number of gravitational lens monitoring programmes that can make use of the new algorithm. 

Key words. Cosmology: observations - Gravitational lensing - Methods: statistical 



1. Introduction 

Gravitationally lensed quasar images can be monitored for a long 
period of time. The obtained time series can then be used to es- 
timate time delays for different light paths to the observer. The 
best example of this kind of long-term photometry is a time se- 
ries obtained by R. Schild and collaborators (see for instance 
■ Schild et al. [13971 for the ubiquitous system QSO 0957+561. 
k> \ Typically, such long programmes can be carried out only on 
j_J ■ small telescopes and consequently they will have a modest res- 
' olution. For many interesting multiply lensed systems, some of 
the images may remain unresolved. It occurs that for a reason- 
ably long time series it is possible to recover complex time delay 
systems even from the blended images. 

There are a number of different time delay estimation 
methods used by various research groups. For a short re- 
view of the popular methods see for instance Kundic et 
al. (1997 ). Some recent more peculiar approaches can be found 
in Hjorth ( fl992l Pijper s ( fl997l >, Barkana ( fT9971 , Burud d200TT) . 
and Gil-Merino 020021 ). These methods can roughly be divided 
into three classes: 

- cross-correlation based methods; 

- methods based on interpolation (linear, polynomial, spline, 
etc.); 

- methods which use dispersion spectra. 

In principle, nearly all the known methods can be reformulated 
so that they can handle more complex models of time series, 
such as those treated in the current paper. Because of the simplic- 
ity and familiarity of the last class of methods, the new methods 
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described hereafter are also based on the computing of disper- 
sion spectra, introduced in Pelt et al. ( 1994) and refined in Pelt 
et al. ( [19961 ). 

In Pelt et al. Q998 ), dispersion analysis was generalised for 
systems with multiple images. However, in previous papers and 
algorithms it was always assumed that observed curves are time- 
shifted replicas (perhaps distorted by microlensing) of the source 
variability. In practice, especially if we deal with low resolution 
photometry, this is not always the case. Often the photometric 
aperture covers several weak images and what we effectively get 
is the blend of different lightcurves. From physical considera- 
tions we can predict that the components of the blends are cer- 
tain weighted sums of the time-shifted source variability curves. 
Typically, the time delays between blended components are re- 
markably shorter than the delays between images, which are 
significantly separated. A singular case of blending, where we 
observe only the sum of multiple components, was analysed in 
Geiger (1996). In this method a certain amount of extra informa- 
tion is used from additional interferometric observations. 

Our paper is organised as follows. First we introduce gen- 
eral ideas of the proposed new algorithms. Then we formulate 
the algorithms in the terms of the specific time series operations 
involved. In the next part we describe the numerical tests we 
made to evaluate the algorithms. Then we consider applicability 
of the method in different contexts and apply the method to a real 
observed data sequence - the classical system QSO 0957+561 
A,B. It is well known (from radio observations and deep images) 
that the photometric sequences for this system are not blends. 
However, as will be shown below, one of the observed curves 
can be disentangled into two quite clearly shifted source curves. 
Why it is so we do not know yet, but it is not ruled out that the 
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well-known "small delay difference" controversy may be con- 
nected with this effect (see Yonehara 1999; Gil-Merino et al. 
12001 ; Goicoechea 2002; Yonehara et al. 2003 ). Finally, we give 
a few recommendations for observers on how to plan long mon- 
itoring programmes for not fully resolved images. 

2. Method 

Here we use a slightly different formalism to introduce the new 
method (cf. original introduction of dispersion spectra in Pelt et 
al. ([T994 1996 )). In general, the source quasar image in a grav- 
itational lens system is split due to the gravitational field of the 
intervening galaxy into multiple images /t, . . . ,/k- The source 
variability q(t) shows itself in the measured lightcurves. Because 
of different flight paths the total flight times tk, k = l,.,.,K dif- 
fer. Consequently, the observed luminosities can be described as 
shifted (and possibly magnified) functions of the source variabil- 
ity fk(t) - auq{t - t/t). For a fully resolved case we will have in 
total K continuous curves /(f). This somewhat oversimplified 
model ignores microlensing effects (variability of the magnifica- 
tion coefficients in time) and also other possible distortions. 
For each pair of images fufj we have the corresponding time 
delay Atjj = tj - ?,-. From N(N - l)/2 delays only N - 1 can 
be considered as independent. If we know all these independent 
delays, we can talk about a full set of time delays. 

2. 1 . Two fully resolved lightcurves 

Let us have two unblended images f\{t) = a\q(t - 1\) and f 2 (t) — 
a 2 q(t - to). We can shift the second curve by a time delay At 
and multiply it by an arbitrary magnification ratio a to form a 
difference d( f) 

d(t,At,a)-fi(t)-af 2 (t + At). (1) 

If it happens that At - At\ t2 = t 2 — t\ and a — a\ja 2 then the dif- 
ference vanishes, d(t, At, a) = 0, and we say that the two curves 
match each other. In the case of noisy curves the match cannot be 
perfect. However, we expect that the dispersion of the difference 
is minimised for proper parameter values. 

Finally, for discrete measurement series, we can hope that 
for each set of trial parameter values there are at least some 
time point pairs that can be used to estimate the dispersion. This 
dispersion is used then as a merit function to compare different 
combinations of parameters. This simple example shows that we 
can start from continuous curve models to build (using free pa- 
rameters) matching pairs and then translate our algorithm into 
language of dispersion spectra. For the details of the method see 
Pelt (fli96l 

2.2. More than two fully resolved lightcurves 

Sometimes we may have more than two fully resolved images 
and corresponding lightcurves (see Pelt |1998b . Just for simplic- 
ity let us look at a case when we have three images f\(t) = 
ci\q(t - t\), /2(f) = a 2 q(t _ h), and /3(f) = ajq(t - ti). Now we 
have three amplifications and three time delays from which only 
two are independent. There are also three different possibilities 
to match curves. As shown in Pelt i 19981 ) it is reasonable to add 
dispersions from all three matches. This allows us to use infor- 
mation maximally in the sampled and noisy curves. The number 
of independent variables in this case is four - two delays and two 
amplification ratios. 



2.3. A clean image and a blend 

For three images we have basically only one interesting scheme 
with blended images. In this case we can observe one pure im- 
age, say /i(f) and one blend /(f) = f 2 (t) + f 3 (t) = a 2 q(t - t 2 ) + 
aj,q(t - t-i). These curves cannot be matched. However, we can 
use a "clean" curve f\ (f) = a\q{t - t\) to build an artificial blend 
curve g(t) = fi(t)+afi(t-A s ), where a and A s are additional free 
parameters to be determined. In the terms of the source curve we 
get g(t) - a\q(t - t\) + aa\q(t - 1\ - A s ). It is not hard to see that 
in a fortunate case when A, = Af2,3 and a - aj/a 2 , the curves 
/(f) and g(t) are shifted and amplified versions of each other and 
can be matched. Then we will have three parameters to vary: the 
two artificial blend parameters a, A s and the matching parameter 
A/ = Afi 2 = t 2 - 1\ . In the matching process we also estimate (for 
every parameter triple) two additional regression parameters: a 
and b, which are discussed briefly in Sect. 12.51 As we see, in ad- 
dition to subtraction (to match curves), we must also know how 
to add discretely sampled curves to build artificial blends. 

2.4. Two blends 

If we have four images, then there are many possibilities. The 
unblended scheme can be treated similarly to the case of three 
images. The only difference is that the combined dispersion has 
to be estimated, using the sum of 6 pairwise matches. The num- 
ber of free parameters is 6 and the search grid can be quite large. 
However, it is sometimes possible to reduce the number of free 
magnification ratios. For the case of one blend and two clean 
images we can combine clean images into an artificial blend 
and then compare the combined curve with the measured blend 
curve. 

The most interesting case is where we have only two blends, 
say gi(t) = /(f) + f 2 (t) and g 2 (t) = /(f) + / 4 (f). It is not possi- 
ble to build proper matches in this general case, but very often 
the blend components (in both blends) are of nearly equal lumi- 
nosity. In this case we can form two artificial blends, one from 
the first observed blend curve and another one from the second 
curve. Now we can estimate values for input parameters using 
the dispersion of the differences between two artificial blend 
curves. If it happens that the delay we used to build the second 
artificial blend is just the delay between / and / and vice versa, 
that is, if the delay with which we built the first artificial blend is 
equal to the delay between / and f\, then these artificial blends 
are shifted versions of each other. (The delay between the artifi- 
cial blends is the third time delay to be varied.) In a similar way 
we can proceed further to more complex systems. Unfortunately, 
the parameter space is becoming untreatably large for such sys- 
tems, although we can always choose some image subsets and 
then apply one of the simpler schemes. 

2.5. Subtraction of sampled curves 

Till now we presented possible algorithms in the language of 
continuous lightcurves. The real data is always sampled, and 
consequently we need to reformulate our algorithms for this 
case. For every pair of input data tables f„, /„, W„, n = 1,2,. . .,N 
and t,„, g m ,W m ,m = 1,2, ...,M (where W„ = l/(A/„) 2 and 
W m = 1/ (Ag m ) 2 are statistical weights computed from standard 
errors Af„, Ag m given by the observer), we can define their "sta- 
tistical" difference or distance between two curves. 

To be consistent with actual codes implementing the pro- 
posed methods, we assume that the first curve can be amplified 
by a certain coefficient a and it can have a different baseline 
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value b. For a particular set of input parameters we can form 
a table of triples: 

^ , (flf„ + b- gm) 2 , W„j„, (2) 

where W, hm are the statistical weights for every row. The actual 
values for the a and b parameters are to be estimated using the 
least-squares routine, and they are always calculated for every 
set of a, Aj, and A; in our method. All the rows in this table are 
not equally significant. If it happens that t„ = t m , then we can 
assign a full weight to the corresponding row. But if the time 
difference between the two points is quite large (say larger than 
a certain pregiven value cr), then comparing the values for dif- 
ferent curves does not make sense. Following these heuristics we 
introduce the following downweighting function: 



S n,m — 



1 - 

0, 



if \t„ - t m \ < cr, 
if \t„ - t m \ > cr 



(3) 



Finally, the combined statistical weights for every row in the ta- 
ble of squared differences (Eq.[2ji can be written as: 




5100 5150 5200 5250 

Time (days) 



Fig. 1. Combining an original time series with its shifted version. 
Depending on the spacing of time points and the downweighting 
parameter cr, the resulting series can be sparser (middle part of 
the series) or denser (right part). The combined error estimates 
are larger than the original (given) values. 



W n + a 2 W„, 



(4) 



The normalised estimator of the dispersion of the difference be- 
tween the two curves is now 



D 2 = min 

a,b 



Imjn( a fn +b- g m ) 2 W n ,, 



(5) 



and we may call it statistical distance. From the point of view 
of the parameter estimation scheme, it is also a merit function to 
compare different sets of parameters. 

Because one of the parameters we search for, a, is included 
in the weight system, the minimisation proceeds iteratively. We 
first fix a = 1, and compute the weights using this value. Then 
we use a standard weighted least-squares routine to estimate 
both the parameters a and b. By inserting the estimated a back 
into the weights, we can proceed iteratively until convergence is 
achieved. Fortunately, only a small number (about four for 0.1% 
precision) of iterations is needed. 

In some cases the parameter combination that minimises the 
statistical distance can be unphysical. For particular shifts one 
curve can approximately match the mirror image of the other, 
and then the a value can be negative. The distance computation 
procedure must take this possibility into account and mark un- 
physical parameter combinations. It must be said that the de- 
scribed "statistical" subtraction procedure is quite general. Input 
data sets can be original data tables, data with shifted time argu- 
ments, or artificial blends computed from input data by adding 
time-shifted variants of it. 



2.6. Addition of sampled curves 

We start again from two input tables t n ,f n ,W„,n = 1,2, ...,N 
and t m ,g m , W m , m = 1 , 2, . . . , M. Now we must compute a dis- 
crete analogue to the combined curve /(f) + ag(t). Using similar 
considerations as above for the case of subtraction, we can form 
the triples 



tn tn . 



-,/„ +ag m ,W„ 
where the combined weights 



a 2 W„ + W„, 



(6) 



(7) 



consist of appropriately propagated weights and downweighting 
function. 

The total number of selected triples (with non-zero weights) 
depends on the downweighting parameter cr. In Fig. Q] we have 
added an original time series (lower part of the figure) and its 
shifted version to form a combined curve (upper part of the fig- 
ure). In the case of a dense sampling, our constructed data set is 
quite redundant, especially for larger values of cr. If the sampling 
step is smaller than cr, we may get sparser time series as well. It 
is very hard to choose a proper value for cr from purely theoret- 
ical considerations. However, the proper range of usable values 
can be established by using model or trial calculations. The sets 
of triples Eq. |2]or Eq. |6]can be looked upon as a new input data 
set for further operations. Combining shifting in time and the 
adding and subtracting of discrete lightcurves, we can now build 
different discrete models for algorithms described above for the 
continuous case. 



3. Time delays from a clean curve and a single 
blend 

As shown in the previous section, there are a number of possi- 
bilities to build interesting algorithms, which allow us to search 
for multiple time delays from blended lightcurves. To be spe- 
cific, we restrict ourselves to only one particular scheme, where 
we have two observed curves: a clean curve C with a flight time 
1 1 and a blend A — A 1 + A2, where the flight time for A 1 is to 
and the flight time for A2 is t^. We use the clean curve to build 
artificial blends B(f) = C{t) + aC(t-At2^) and compare the artifi- 
cial blends with the observed blend A. The additional parameter 
- the magnification factor a - takes into account the possibility 
that the two source images A 1 and A2 have different amplifica- 
tions. For every set of trial delays Af 12, A?2 ; 3 and the factor a we 
statistically subtract two blends and evaluate their distance D 2 . 

A global search for the best parameter combination consists 
of computing the statistical distances for a large grid of parame- 
ter triads. For some of the triads the distance computation can 
reveal unphysical matches. These combinations are discarded 
and are also marked as special in the corresponding multidimen- 
sional plots. The best parameter combination corresponds to the 
global minimum of D 2 and is searched for only among physi- 
cally plausible solutions. 
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Fig. 2. The two-dimensional grid of merit function values for a 
computer generated random walk and a blend computed from it. 
The general minimum must indicate the true pair of long and 
short delay values. The plot demonstrates degeneracy in full- 
scale computations well - there is obvious symmetry between 
the areas for positive and negative values of short delays. Values 
on the colour key represent the log(D 2 ) and spacing of the con- 
tours. The same type of colour key is used in all two-dimensional 
plots. 



There is one interesting aspect in this global search proce- 
dure - it is essentially degenerate. The degeneracy comes from 
the fact that the long and short delays between the blend compo- 
nents can be computed differently. The short delay depends on 
how we assign names to hypothetical parts of the blend. In one 
case the delay is ti - ti, but in another case ti - f3. And corre- 
sponding long delays will be also different: t^—t] and fa — 1\ , This 
degeneracy results in symmetrically placed minima on the grid 
of the time delays (see for instance Fig. [2j». For finite sequences 
both solutions can give slightly different values for the merit 
function because of the boundary effects. Sometimes physical 
considerations can define the proper order of total flight times 
and then we do not need to compute full grids, but can restrict 
our computations to only one half of them. 

It is also important to check how the actual distribution of 
time moments in input data sets influences the statistical weights 
in the final expression for dispersion. Even in the case of the 
comparison of two pure (one of them shifted) lightcurves, it 
is not ruled out that for a particular time delay the observa- 
tions of one curve occur just in the gaps of the other. The well- 
known controversy on the time delay of the classical double 
quasar QSO 0957+561 was just a result of this kind of accident 
(see for details Press et al. |1992al[l992bl and Pelt et al. [19941 . 
Multidimensional graphs of the parameter dependent sums of 
weights from Eq. [5] can reveal regions where there is not suffi- 
cient information to estimate the parameters of the model. 



4. Tests with simulated data sets 

Our goal in this paper is to show that the approach described 
above can be useful - if not generally - then in many interesting 
situations. We do not have observed data sets for real systems 
at our disposal that are long enough and have been measured 
for blended systems. Thus we have to use computer-generated 
models. We hope that availability of the new method gives an 
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Fig. 3. A basic computer-generated random walk C (lower 
curve) and a computer generated blend curve A (A/ = 50.2, 
A s = 10.6, the amplification parameter a — 1.3; we added 5% 
noise to both curves and shifted the blend up by 60 units). 



extra motivation for astronomers observing at telescopes with 
modest resolutions to carry out long monitoring programmes for 
gravitational lens systems. 

The generation of simulated data is simplified by the fact 
that model curves for different images can be computed from the 
same source curve. We can apply different shifts to time points 
of a given or generated sampling scheme, then interpolate values 
from the source curve at shifted positions, and finally combine 
the obtained values into blends. To achieve an appropriate degree 
of realism, we often used time points and weighting sequences 
from real observations. This allowed us to analyse very irregular 
and inhomogeneous distributions with caps. 

In most cases we used a simple random walk procedure to 
generate the source variability curves. The time steps for the 
curves were selected according to two principles: they must be 
shorter than typical sampling intervals and they must be longer 
than typical photometric integration times. A random value of 
+ 1.0 was assigned cumulatively to each step in the intensity 
scale. If we wanted to model actual observations (especially to 
use their statistical weight systems), then the resulting curves 
were appropriately scaled. One of the generated curves and the 
blend constructed from it is shown in Fig. [3] 

It was interesting to observe that sometimes the generated 
curve was quite poor in features (minima and maxima, etc.). In 
these cases we discarded them. There is a similar effect when 
dealing with actual lens systems. A quasar can be "quiet" for 
a long time and its photometry is not sufficient for time delay 
estimation. 



4. 1 . Random sampling 

We started our analysis with a simple sampling scheme where 
the initial time points were generated by using random step sizes 
from the interval [0.2, 1.8] days. The generated time points were 
then used to read off (using appropriate time shifts and linear 
interpolation) three different data sequences for the images C, 
Al, and A2. From the two last images we formed the blend 
A = A 1 + aA2 with a fixed amplification ratio a. In a particular 
example displayed in Figs.|2]-|4l the following parameters were 
used: the "long" time delay between C and A 1 was A; = 50.2, 
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Fig. 4. A classical dispersion spectrum computed for the com- 
puter generated curve C and the artificial blend A. It reveals a 
shift Af ~ 59 days. However, the blend was generated by us- 
ing a long delay value 50.2 days and a short delay 10.6 days. 
Consequently, blending can mask proper time delay values. The 
fully resolved case is shown in Fig. [2] 



the "short" delay between the blend components was A s = 10.6, 
and the amplification parameter a - 1.3. To take into account 
daylight and randomly changing observational conditions, part 
of the time points were discarded. A typical run of our scheme is 
given in Fig. [3] From the initially generated 734 sample points, 
only 306 were left in the time series. Finally, we added a random 
5% Gaussian noise component to the curves C and A. 

The two resulting model curves C and A can be used as input 
for a standard time delay estimation procedure. As seen from 
Fig. [4] in this example there is a quite well pronounced global 
minimum in the dispersion curve, but its position is at Af « 59 
days. Blending can move dispersion minima from one place to 
another. 

To recover both delays, we need to apply our new method 
with artificial blends. First we form a three-parameter search grid 
[-360,360: 1.0]x[-90,90: 1. 0]x [0.6, 1.6 : 0.1 fl and compute 
the merit function D 2 values for each grid point. The value for 
downweighting parameter <x was taken at 2.5 days. (See Sect. [6] 
for discussion on choosing the value of parameter <x.) The pa- 
rameter combination for the minimum value of the merit func- 
tion is then selected as the solution. In the current example the 
best triple occurred at A/ = 50 days, A s - 11 days, and a — 1.1. 
The two-dimensional slice at a — 1 . 1 of the search grid is given 
in Fig. 12 In this plot we can clearly see the degenerate character 
of our procedure. Depending on the ordering of the blend com- 
ponents, we can attain the deepest minima in two symmetrically 
placed areas. Formally, both global minima are of equal value, 
but typically the actual merit function values for both minima 
differ slightly. If we do not have any additional information (say 
from deep lens system images) we can just formally select the 
strongest minimum. Another interesting point of the current ex- 
ample is the fact that our algorithm did not exactly recover the 
amplification ratio parameter (we found 1 . 1 instead of 1.3). This 
is quite typical - for every particular pair of delay values the 
merit function dependence on the parameter a is quite weak and 
the corresponding curve has a wide minimum around the correct 
value. 



Here and below we use a systematic notation for search grids. 
Inside the square brackets we give the minimum and maximum values 
for the parameter in question, followed by the grid step. 
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Fig. 5. A random walk C (lower curve) and a blend A (upward- 
shifted curve) with real sampling. 



For high quality data (with small errors) we can look for a fi- 
nal solution with higher precision. For each strong minima found 
during the rough analysis, we can build refined local parameter 
grids in the vicinities of the preliminary solutions. An example 
of such local refinement is given in the next subsection. 

4.2. Real sampling 

To test our method under real sampling conditions, the time 
points from the observational data by SchilcQ were used to sam- 
ple the C and A curves built from a computer-generated random 
walk pattern. The standard errors given by Schild were scaled 
according to the ratio of the full amplitudes of the real and simu- 
lated data. Next these scaled errors were used as standard devia- 
tions for Gaussian noise components, which were added to each 
point of simulated data. The resulting curves for one particular 
run are shown in Fig. [5] 

The values of parameters used in this simulation were: A; = 
420.15 days, A s = 20.21 days, a = 0.8. We selected a 4202 
day time-interval from Schild's observations and the number of 
data points in the input table was 1032. A crude search grid 
[-150,1000: 1.0] x [-200, 200: 1.0] x [0.6, 1.6 : 0.1] was used 
to estimate the amplification parameter or. The grid slice with the 
best value a = 0.9 was then used to refine other two parameters. 
(Here and in the next section the value for the downweighting 
parameter <x was taken at 7 days.) Finally we got the estimates 
for the delays A/ = 420.2 days, A s = 20.0 days. The plot of merit 
function values in the slice [-150, 1000 : 0.1]x[-200,200 : 0.1] 
is shown in Fig. [6] Again, we can see the two symmetrically 
placed minima. 

To see how observational errors would affect the results, we 
performed an additional experiment in the vicinity of the ob- 
tained solution. We added gradually varying levels of Gaussian 
noise to the simulated data and evaluated merit function values 
for the slice [360,480 : 0.1] x [-40,40 : 0.1] (a = 0.9). The 
corresponding plots for the four noise levels are given in Fig. [7] 
It is clearly seen how the minima are smeared out when the noise 
level rises. From TableQ]we can see how the global solution be- 
haves when noise is added. As was mentioned above, sometimes 
the solution can jump to its mirror place (and in this case we 

2 http ://cf a- w w w. harvard. edu/~rschild/f ulldata2 . txt 
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Fig. 6. Merit function values for simulated data and real sam- 
pling. 



: : : : ! ! 

P.S9 0.34 U-4fl Ml ».7S 0B9 1.05 118 1.33 1A3 1-57 1.70 f.BJ 1.9* E.11 




Fig. 7. Vanishing of the characteristic minima of the merit func- 
tion values due to observational errors. The noise levels are 
(clockwise from upper left): 0%, 2%, 5%, and 10%. 



essentially do not lose it). For higher noise values we can com- 
pletely lose the correct solution. When planning for a long time 
photometric monitoring it is reasonable to establish the appro- 
priate levels of measurement precision. For that purpose similar 
model calculations as carried out in this section can be useful. 

4.3. Error estimation 

To test statistical properties of the new method we used Monte 
Carlo type calculations. We added appropriately scaled Gaussian 
noise components to the noise-free model curves from the pre- 
vious section, so that the expected signal-to-noise ratios were 
the same as those for Schild's data. Using Estonian GRID0 re- 
sources, we repeated this 3500 times and stored the obtained op- 
timal parameter triples. From this resulting table we calculated 
the average values and standard deviations: A/ = 419.6 + 0.8 
and A, = 20.14 + 1.22 days, a = 0.94 + 0.10. The resulting bi- 



3 See http://grid.eenet.ee/en/ 



Table 1. Location of global minima depending on added 
Gaussian noise. 
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Fig. 8. Merit function values for the actual double quasar data. It 
differs significantly from Fig. [9] 

ases 419.6 - 420.15, 20.14 - 20.21 and standard errors for this 
particular experiment setup are reasonably small. 

5. Results for a real system 

To evaluate the new method in a more realistic context, we used 
the master data set for the double quasar QSO 0957+561 A,B 
kindly provided by Rudy Schild (6806 days, 1233 time points, 
R-band optical CCD photometry). As far as we know, the com- 
ponents of the system itself cannot be considered as blends and 
consequently we used this data set as a model for time point 
spacing and observational error distribution for a long and re- 
alistic monitoring programme, as was discussed in Sect. 14.21 In 
the course of experimentation we also performed some calcu- 
lations with the full Schild data set and got unexpected results. 
Assuming that B is a blend, we indeed got a distribution charac- 
teristic to the blended case, shown in Fig. [8] The estimates for 
the time delays in real data are A; = 412, A s - 22 days. The 
amplification factor was held fixed at a = 1.0. (a « 1 .0 is the 
expected value for short A s ). In the current section, the value for 
the downweighting parameter cr was taken 10 days. 

To convince ourselves that the symmetric minima in Fig. [8] 
are not caused by boundary effects of our computational algo- 
rithm, we performed some additional tests. First we used the 
real time moments and error estimates from the same data set 
and built a pair of artificial curves with a given single time de- 
lay between the two curves A; = 412. Then we disturbed both 
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Fig. 9. Merit function values for a random walk and its shifted 
version (A/ = 412 days). Timepoints and standard errors are 
from Rudy Schild's monitoring programme. 
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Fig. 10. Merit function values for a computer generated random 
walk and the blend computed from it using the parameters found 
from the real observational data. 



curves by appropriately scaled random errors. The resulting two- 
dimensional slice of the merit function is shown in Fig. [9] It is 
well seen that there is one unique global minimum near the true 
delay value, indicating that we do not have a blend here, the de- 
lay applied is recovered and the estimated short shift value (if 
we assume that the B curve is a blend) is zero. Consequently, 
our method does not generate symmetrically placed minima just 
as an artifact of the procedure. 

Finally we built an artificial blended model with the long 
and short delays found from the real curves A and B. The re- 
sulting grid for the optimal amplification parameter is shown in 
Fig. QJJ From the last simulation we found A/ = 412, A s = 25 
days, indicating that the time delay values found from the real 
data are real. The three-day long estimation error in short shift 
characterises the precision of the algorithm at the given level of 
observational accuracy. 

To check how statistically stable the merit function values 
are for different parameter combinations, we calculated expected 
sums of weights for the time delay grid of Schild data. As it is 
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Fig. 11. The expected sums of weights for the time delay space 
of Schild's data. Zero values represent areas with unphysical 
negative parameter a values as described in the text. 

seen from Fig.Qj] A/ = 412 and A s = 22 days fall into the region 
of higher weights and should be considered a reliable result. It 
is currently very difficult to tell why the B curve of the classical 
double quasar behaves as a blend. It is known that there is some- 
thing wrong with the estimated time delays and magnification 
ratios (if optical data is compared with radio data). The pecu- 
liar form of microlensing proposed in Press (1998) can solve 
the problem of magnification ratios. However, it is hard to ex- 
pect that the spacing of microlensing events in time can mimic 
a proper blend. In another development, Goicoechea (2002) sin- 
gles out the different features in the double quasar lightcurves 
which give different values for time delays. As a possible ex- 
planation he uses a quasar model with spatially distant flares, as 
discussed also in Yonehara (1999) and Yonehara et al. ((2003). 
Similar and even more radical ideas can be found in a recent 
work by Schild (2005). Our computations show that not only sin- 
gle events, but the full B curve of the system can be decomposed 
into a sum of two similar and shifted curves. What theoretical 
interpretation can be given to this phenomenon remains an open 
question. 

6. Discussion 

Above, we tried to demonstrate that a very interesting data pro- 
cessing method exists, which allows us to estimate integral time- 
delay systems for blended (not fully resolved) lightcurves. There 
are not yet enough real observed sequences to evaluate the full 
potential of the new method, but our computations allow us to 
get at least a preliminary idea about the applicability of such al- 
gorithms. When planning a new monitoring programme we sug- 
gest to consider the following aspects: 

- The total time base of observations must be determined from 
the expected length of the longest time delays. We cannot 
sensibly find longer time delays from the observed time se- 
ries, than about half of the length of the time series itself. 

- Sampling should be dense enough to match the characteristic 
periods of variability of the source quasar. In more precise 
terms - the sums of weights for every parameter combination 
to be compared must be large enough to avoid statistically 
unstable merit function values. 
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- Sampling determines the shortest possible delay we are able 
to find from the particular data. If the downweighting param- 
eter cr (see formula[3]) is too small for a given sampling, we 
will have too few pairs in the calculation of the merit func- 
tion and the map of D 2 will be poor due to noise and bound- 
ary effects. On the other hand, enlargement of cr is limited 
because of the smoothing effect of this parameter - using 
larger cr reduces the possibility of finding shorter time de- 
lays. As a rule of thumb cr should be kept equal to or smaller 
than half of smallest possible time delay A s we are trying 
to find. For sound statistics, we should have on the average 
at least 3-5 pairs for every observed time point when com- 
bining and subtracting the time series. In practice it would 
be useful to carry on computations with varying values of cr 
to check statistical stability and robustness. See for instance 
Pelt (1 19961 1 where such an analysis was used for a simple 
case of delay estimation. 

- A sufficiently dense and truly random distribution of the time 
points is best for the method described above. The strongest 
source of uncertainty for the described algorithms is period- 
icity of data gaps. Unfortunately, ground-based astronomi- 
cal observations tend to be of this kind. To lessen the effect 
of gaps, combined observations from different sites can be 
used. 

- As we saw in Sect. 14.21 observational errors for typical ex- 
perimental setups must be kept under 5% of the amplitude of 
lightcurve. 

- As a sanity check, it is worth to compute merit function val- 
ues for a larger parameter grid. Then the overall pattern of 
symmetrically shaped and mirrored minima allow us to get 
the general impression of the validity of our solutions. 

The software modules we have developed can be used to 
model situations that can occur in real long-time monitoring pro- 
grammes. By varying model parameters we can estimate suffi- 
cient durations for observational sessions and also the accuracy 
of observation needed. Unfortunately, even accurately planned 
sessions can result in a failure because the source quasars them- 
selves can show persistent stationarity or the time series ob- 
served can be contaminated by strong microlensing. It is not 
ruled out that the proposed method can be used in absolutely 
different contexts. One of the applications we are presently con- 
sidering is disentangling echo components in certain high energy 
events of cataclysmic stars. 
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7. Conclusion 

A method of finding time delays from photometry of unresolved 
gravitationally lensed quasar images has been developed and 
tested using simulated and real data sets. It can be used to anal- 
yse already existing data sets and also to plan new observa- 
tions. We encourage observers to run long time monitoring pro- 
grammes of blended images to reveal time delays from obtained 
data. 
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